Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(aws_s3 source): Support vhost-style S3 bucket addressing #22475

Open
wants to merge 7 commits into
base: master
Choose a base branch
from

Conversation

sbalmos
Copy link
Contributor

@sbalmos sbalmos commented Feb 19, 2025

Summary

This adds support for vhost-style S3 bucket addressing in the aws_s3 source, equivalent to the support added in the aws_s3 sink in #21999.

Change Type

  • Bug fix
  • New feature
  • Non-functional (chore, refactoring, docs)
  • Performance

Is this a breaking change?

  • Yes
  • No

How did you test this PR?

Does this PR include user facing changes?

  • Yes. Please add a changelog fragment based on our guidelines.
  • No. A maintainer will apply the "no-changelog" label to this PR.

Checklist

  • Please read our Vector contributor resources.
    • make check-all is a good command to run locally. This check is
      defined here. Some of these
      checks might not be relevant to your PR. For Rust changes, at the very least you should run:
      • cargo fmt --all
      • cargo clippy --workspace --all-targets -- -D warnings
      • cargo nextest run --workspace (alternatively, you can run cargo test --all)
  • If this PR introduces changes Vector dependencies (modifies Cargo.lock), please
    run dd-rust-license-tool write to regenerate the license inventory and commit the changes (if any). More details here.

References

#21999

@sbalmos sbalmos requested review from a team as code owners February 19, 2025 16:51
@github-actions github-actions bot added domain: sources Anything related to the Vector's sources domain: external docs Anything related to Vector's external, public documentation labels Feb 19, 2025
///
/// This controls if the bucket name is in the hostname or part of the URL.
#[serde(default = "crate::serde::default_true")]
pub force_path_style: bool,
Copy link
Member

@pront pront Feb 19, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: It would be cleaner if this was an enum. I missed this #21999.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Example:

/// The Amazon OpenSearch service type, either managed or serverless; primarily, selects the
/// correct AWS service to use when calculating the AWS v4 signature + disables features
/// unsupported by serverless: Elasticsearch API version autodetection, health checks
#[configurable_component]
#[derive(Clone, Debug, Eq, PartialEq)]
#[serde(deny_unknown_fields, rename_all = "lowercase")]
pub enum OpenSearchServiceType {
/// Elasticsearch or OpenSearch Managed domain
Managed,
/// OpenSearch Serverless collection
Serverless,
}

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah I was going to say, it's to be consistent with what the sink does. Do you want me to cross-implement in both the source and sink?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we can accept as is (with the bool) and in the future deprecate all the bools and introduce enums. You don't have to do the later, but it would be greatly appreciated if you did :)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll leave this change for the next version round, since the bool version was already approved for the sink, and the bool version for the source is approved here. I think if things are okay with the first version, I'll concurrently change the default on both the sink and source to use vhost format. It'd be a breaking change anyway to switch from bool to an enum, so changing the default from path to vhost is about the same level I'd think.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree to keep this as is. Note here, that we can avoid breaking the behavior here by introducing new fields and deprecate the old ones. See https://github.com/vectordotdev/vector/blob/master/docs/DEPRECATION.md

Copy link
Contributor

@buraizu buraizu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Approving with some suggested updates

Co-authored-by: Bryce Eadie <bryce.eadie@datadoghq.com>
@pront pront changed the title feat(aws_s3 source) Support vhost-style S3 bucket addressing feat(aws_s3 source): Support vhost-style S3 bucket addressing Feb 26, 2025
Copy link
Member

@pront pront left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@sbalmos
Copy link
Contributor Author

sbalmos commented Feb 26, 2025

The spell checker seems to be hung up on my username in the authors of the changelog fragment file, and the use of the word vhosts in the filename. Not sure if I have rights to also push a commit that adds to the exclusions?

@pront pront enabled auto-merge February 26, 2025 19:38
@sbalmos
Copy link
Contributor Author

sbalmos commented Feb 26, 2025

Heh, and now the origin update, removing shannon, broke spellchecker on that word. :(

auto-merge was automatically disabled February 27, 2025 03:24

Head branch was pushed to by a user without write access

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
domain: external docs Anything related to Vector's external, public documentation domain: sources Anything related to the Vector's sources
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants